Poster: htfiddler Date: Sep 9, 2008 3:57pm
Forum: web Subject: Number of Pages in Site?

Hi ... confused here again, but different topic.

So all these links we see point to the top page, i.e. the home page of a specific URL ... right? Well, the site I'm
looking at contains hundreds of internal pages. For instance,

http://web.archive.org/web/*/http://www.dieoff.org

contains hundreds of pages. Here is the Dec 28, 2006 crawl:

http://web.archive.org/web/20061228040340/http://dieoff.org/

and here is a page linked to from the above page, apparently crawled about 3 months earlier on Sep 27, 2006:

http://web.archive.org/web/20060927090516/dieoff.org/page193.htm

So far, so good. Now, let's say this site has 100 internal pages and 2 weeks later the webmaster makes no changes
(or minor ones) to the home page, but either adds or deletes 80 internal pages to/from the site. As I understand it, on the next crawl, we will see the asterisk (showing the site was updated), we will see the identical or slightly updated home page, but we will get no indication at all that the overall site has undergone such massive changes. Is that correct ... is this how Wayback currently works?

If I do understand correctly, then there is apparently no way for me to easily track the overall size of my example site (dieoff.org) over the 11 years of its existence. I would like to suggest that perhaps Wayback could display the number of pages, the number of total distinct files (including gif/jpg), and the number of megabytes for each crawl, to help us spot the major updates/changes over time.
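To make the suggestion concrete, here is a rough Python sketch of the kind of per-crawl summary I mean. It is not anything Wayback offers: the `summarize_captures` function and the `(timestamp, url, mime, size_bytes)` record format are hypothetical, and the sizes in the sample list (plus the `logo.gif` entry) are made up purely for illustration.

```python
from collections import defaultdict


def summarize_captures(captures):
    """Group capture records by month and report rough size figures.

    captures: iterable of (timestamp, url, mime, size_bytes) tuples.
    This record format is an assumption, not a real Wayback export.
    """
    by_month = defaultdict(lambda: {"pages": set(), "files": set(), "bytes": 0})
    for timestamp, url, mime, size_bytes in captures:
        bucket = by_month[timestamp[:6]]  # YYYYMM prefix of the capture time
        if url not in bucket["files"]:
            # Count each distinct file's size only once per month.
            bucket["bytes"] += size_bytes
            bucket["files"].add(url)
        if mime == "text/html":
            bucket["pages"].add(url)
    return {
        month: {
            "pages": len(b["pages"]),
            "files": len(b["files"]),
            "megabytes": b["bytes"] / 1_000_000,
        }
        for month, b in sorted(by_month.items())
    }


# Hypothetical sample records; the sizes are invented for illustration.
sample = [
    ("20060927090516", "dieoff.org/page193.htm", "text/html", 500_000),
    ("20061228040340", "dieoff.org/", "text/html", 1_000_000),
    ("20061228040341", "dieoff.org/logo.gif", "image/gif", 500_000),
]

for month, stats in summarize_captures(sample).items():
    print(month, stats)
```

A jump in `pages` or `megabytes` from one month to the next would be the signal of a major site update, even when the home page itself barely changed.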

If I'm correct here, how would I make this suggestion? Do I just email info@archive.org?

Thanks,
Confused.


Poster: kustota Date: Sep 9, 2008 7:38pm
Forum: web Subject: Re: Number of Pages in Site?

You might find advanced search helpful:
http://web.archive.org/collections/web/advanced.html
For instance,
http://web.archive.org/web/200401-200402*re_/http://www.dieoff.org*
will give you all pages from Jan-Feb 2004.
You should keep in mind that sites are not always archived in their entirety; sometimes pages and images are missing.
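
If you want to generate these queries for several date ranges, a small helper can assemble the same URL shape. This is only a sketch of the pattern in the example above, not an official API: the name `wayback_range_query` and its parameters are made up, and the `re_` flag and trailing `*` are simply copied from the example URL.

```python
def wayback_range_query(start: str, end: str, site: str) -> str:
    """Build a wildcard query URL in the same shape as the example above.

    start, end: date prefixes such as "200401" (YYYYMM).
    site: URL prefix to match, e.g. "http://www.dieoff.org".
    """
    # "re_" and the trailing "*" are copied from the forum example;
    # their exact meaning is an assumption here.
    return f"http://web.archive.org/web/{start}-{end}*re_/{site}*"


print(wayback_range_query("200401", "200402", "http://www.dieoff.org"))
```

Called with the values shown, this reproduces the Jan-Feb 2004 URL given above.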


Poster: Face_ Date: Sep 13, 2008 4:27am
Forum: web Subject: Re: Number of Pages in Site?

Indeed, which, in my honest opinion, sucks big time.